

## **Attachment D**

**Pages MU0023213, MU0023250-52, MU0023254-55, MU0023259, and MU0023308-14  
of Exhibit A2 included with the  
Declaration of Craig Hansen Under 37 CFR § 1.313 filed on September 18, 2009**

**(14 pages)**

# microunity

## Terpsichore System Architecture

REGISTERED CONFIDENTIAL AND PROPRIETARY INFORMATION OF  
MICROUNITY SYSTEMS ENGINEERING, INC., NOT INTENDED FOR  
DISTRIBUTION OUTSIDE OF MICROUNITY WITHOUT THE EXPRESS  
WRITTEN CONSENT OF AN OFFICER OR DIRECTOR OF MICROUNITY.

~~REDACTED~~

Copy Number: 247

**REDACTED**

Issued To: \_\_\_\_\_  
Final Test

Issued By: \_\_\_\_\_  
(MicroUnity officer or director)

Craig Hansen  
Chief Architect  
MicroUnity Systems Engineering, Inc.  
255 Caspian Drive  
Sunnyvale, CA 94089-1015  
Tel: (408) 734-8100 Fax: (408) 734-8136  
EMail: craig@microunity.com

MU 0023213

Highly Confidential

callee (non-leaf):

|                |              |
|----------------|--------------|
| S.64           | sp,off(dp)   |
| L.64           | sp,off(dp)   |
| S.64           | link,off(sp) |
| S.64           | dp,off(sp)   |
| ... (using dp) |              |
| L.64           | link,off(sp) |
| L.64           | dp,off(sp)   |
| L.64           | sp,off(dp)   |
| B.DOWN         | link         |

callee (leaf):

|                |      |
|----------------|------|
| ... (using dp) |      |
| B.DOWN         | link |

The callee, if it uses a stack for local variable allocation, cannot necessarily trust the value of the sp passed to it, except as a region to receive parameters held in memory.

### Pipeline Organization

Terpsichore performs all instructions as if executed one-by-one, in-order, with precise exceptions always available. Consequently, code that ignores the subsequent discussion of Terpsichore pipeline implementations will still perform correctly. However, the highest performance of the Terpsichore processor is achieved only by matching the ordering of instructions to the characteristics of the pipeline. In the following discussion, the general characteristics of all Terpsichore implementations precede discussion of specific choices for specific implementations.

#### Super-string Pipeline

Terpsichore is designed to fetch and execute several instructions in each clock cycle. For a particular ordering of instruction types, one instruction of each type may be issued in a single clock cycle. The ordering required is A, L, E, S, B; in other words, a register-to-register address calculation, a memory load, a register-to-register data calculation, a memory store, and a branch. Because of the organization of the pipeline, each of these instructions may be serially dependent. Instructions of type E include the fixed-point execute-phase instructions as well as floating-point and digital signal processing instructions. We call this form of pipeline organization "super-string,"<sup>4</sup> because of the ability to issue a string of dependent instructions in a single clock cycle, as distinguished from super-scalar or super-pipelined organizations, which can only issue sets of independent instructions.
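As a rough illustration of the ordering rule, the sketch below greedily packs a sequence of instruction types into issue strings. It assumes only what the paragraph states: at most one instruction of each type per string, in the order A, L, E, S, B. The function name `pack_strings` and the greedy policy are hypothetical; real issue logic also depends on latencies and resource conflicts that are not modelled here.

```python
ORDER = "ALESB"  # required type order within one issue string


def pack_strings(types):
    """Greedily split a sequence of instruction types into issue strings.

    A new string starts whenever the next instruction's type does not come
    strictly later in ORDER than the previous instruction in the string.
    """
    strings, current, last = [], [], -1
    for t in types:
        pos = ORDER.index(t)
        if pos <= last:  # would violate the A, L, E, S, B ordering
            strings.append("".join(current))
            current, last = [], -1
        current.append(t)
        last = pos
    if current:
        strings.append("".join(current))
    return strings


print(pack_strings("ALESB"))
print(pack_strings("LEALEESB"))
```

Under these assumptions, a sequence already in A, L, E, S, B order fits in one string, while any out-of-order or repeated type forces a new string.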

These instructions take from two to five cycles of latency to execute, and a branch prediction mechanism is used to keep the pipeline filled. The diagram below shows a box for the interval between the issue of each instruction and its completion.

<sup>4</sup>Readers with a background in theoretical physics may have seen this term in another, unrelated context.

Bold letters mark the critical latency paths of the instructions, that is, the periods between the required availability of the source registers and the earliest availability of the result registers. The A-L critical latency path is a special case, in which the result of the A instruction may be used as the base register of the L instruction without penalty. E instructions may require additional cycles of latency for certain operations, such as fixed-point multiply and divide, floating-point, and digital signal processing operations.



#### Super-spring Pipeline

Terpsichore provides an additional refinement to the organization defined above, in which the time permitted by the pipeline to service load operations may be flexibly extended. Thus, the front of the pipeline, in which A, L and B type instructions are handled, is decoupled from the back of the pipeline, in which E and S type instructions are handled. This decoupling occurs at the point at which the data cache and its backing memory are referenced; similarly, a FIFO that is filled by the instruction fetch unit decouples instruction cache references from the front of the pipeline shown above. The depth of the FIFO structures is implementation-dependent, i.e., not fixed by the architecture.

The diagram below indicates why we call this pipeline organization feature "super-spring," an extension of our super-string organization.



MU 0023251

With the super-spring organization, the latency of load instructions can be extended, so execute instructions are deferred until the results of the load are available. Nevertheless, the execution unit still processes instructions in normal order, and provides precise exceptions.



#### Branch/fetch Prediction

Terpsichore does not have delayed branch instructions, and so relies upon branch or fetch prediction to keep the pipeline full around unconditional and conditional branch instructions. The hardware prediction mechanism is tuned for optimizing conditional branches that close loops or express frequent alternatives, and will generally require substantially more cycles when executing conditional branches whose outcome is not predominantly taken or not taken. For such cases, code that avoids conditional branches in favor of set-on-compare and multiplex instructions may result in greater performance.
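The branch-free idiom mentioned here can be sketched in a few lines: a comparison produces an all-ones or all-zeros mask, and a bitwise multiplex then selects between two values without any conditional branch. The helper names `set_less`, `mux` and `branchless_min`, and the 64-bit width, are assumptions for illustration and are not Terpsichore mnemonics.

```python
MASK64 = (1 << 64) - 1


def set_less(a, b):
    """Set-on-compare: all ones if a < b (unsigned), otherwise zero."""
    return -(a < b) & MASK64


def mux(mask, x, y):
    """Bitwise multiplex: bits of x where mask is 1, bits of y elsewhere."""
    return (x & mask) | (y & ~mask & MASK64)


def branchless_min(a, b):
    """Minimum of two unsigned 64-bit values without a conditional branch."""
    return mux(set_less(a, b), a, b)


print(branchless_min(3, 7), branchless_min(7, 3))
print(hex(mux(0xFF00, 0xABCD, 0x1234)))
```

The data-dependent decision is carried in the mask rather than in control flow, which is why such code does not depend on the predictor guessing the outcome.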

#### Additional Load and Execute Resources

MU 0023252

Studies of the dynamic distribution of Terpsichore instructions on the various benchmark suites indicate that the most frequently-issued instruction classes are load instructions and execute instructions. In a high-performance Terpsichore implementation, it is advantageous to consider execution pipelines in which the ability to target the machine resources toward issuing load and execute instructions is increased.

One of the means to increase the ability to issue execute-class instructions is to provide the means to issue two execute instructions in a single-issue string. The execution unit actually requires several distinct resources, so by partitioning these resources, the issue capability can be increased without increasing the number of functional units, other than the increased register file read and write ports. The partitioning favored for the initial implementation places all instructions that

## Instruction Set

All instructions are 32 bits in size, and use the high order 8 bits to specify a major operation code.

| 31..24 | 23..0 |
|--------|-------|
| major  | other |
| 8      | 24    |

The major field is filled with a value specified by the following table:

| MAJOR | 8       | 32       | 64         | 96        | 128     | 160       | 192       | 224     |
|-------|---------|----------|------------|-----------|---------|-----------|-----------|---------|
| 0     | ARES    | ESETIE   | FMULADD16  | GMULADD1  | LU16LAI | SAAS64BAI | BFE16     | BE      |
| 1     |         | ESETINE  | FMULADD32  | GMULADD2  | LU16BAI | SAAS64BAI | BFNE16    | BNE     |
| 2     |         | ESETIL   | FMULADD64  | GMULADD4  | LU16LI  | SCAS64LAI | BFUE16    | BL      |
| 3     |         | ESETIGE  | FMULADD128 | GMULADD8  | LU16GI  | SCAS64BAI | BFNU16    | BGE     |
| 4     | AADDI   | EADDI    | FMULSUB16  | GMULADD16 | LU32LAI | SMAS64BAI | BFNUGE16  |         |
| 5     |         | EADDIS0  | FMULSUB32  | GMULADD32 | LU32BAI | SMAS64BAI | BFUGE16   |         |
| 6     |         | ESETIUL  | FMULSUB64  | GMULADD64 | LU32GI  | SMAS64BAI | BFUL16    | BUL     |
| 7     |         | ESETIUGE | FMULSUB128 |           | LU32GI  | SMUX64BAI | BFNU16V   | BUGE    |
| 8     |         | ESUBIE   |            |           | L16LAI  | S48LAI    | BFE32     | BANDE   |
| 9     |         | ESUBINE  |            |           | L16BAI  | S16BAI    | BFNE32    | BANDNE  |
| 10    |         | ESUBIL   |            |           | L16LI   | S16LI     | BFUE32    | BANDL   |
| 11    |         | ESUBIGE  |            |           | L16GI   | S16GI     | BFNU32    | BANDGE  |
| 12    | ASUBI   | ESUBI    | F16        | GMULADD16 | L32LAI  | S32LAI    | BFNUGE32  |         |
| 13    |         | ESUBI0   | F32        | GMULADD32 | L32BAI  | S32BAI    | BFUGE32   |         |
| 14    |         | ESUBI64  | F64        | GMULADD64 | L32GI   | S32GI     | BFUL32    | BANDG   |
| 15    |         | ESUBIRGE | F128       |           | L32GI   | S32GI     | BFNU32    | BANDLE  |
| 16    | AANDI   | EANDI    | FMULADD16  | GMULADD16 | L64LAI  | S64LAI    | BFE64     |         |
| 17    | AORI    | EORI     | FMULADD32  | GMULADD32 | L64BAI  | S64BAI    | BFNE64    |         |
| 18    | AXORI   | EXORI    | FMULADD64  | GMULADD64 | L64LI   | S64LI     | BFUE64    |         |
| 19    |         | AMUL     |            | GMUL      | L64GI   | S64GI     | BFNU64    |         |
| 20    | ANANDI  | ENANDI   | FMULSUB16  | G_EXTRACT | L128LAI | S128LAI   | BFNUGE64  |         |
| 21    | ANORI   | ENORI    | FMULSUB32  | G_EXTRACT | L128BAI | S128BAI   | BFUGE64   | CBGATEI |
| 22    |         | EADDI0   | FMULSUB64  | G_EXTRACT | L128LI  | S128LI    | BFUL64    |         |
| 23    |         | ESUBI0   | F128       |           | L128GI  | S128GI    | BFNU64    |         |
| 24    |         |          | F.16       | G.1       | LUBI    | S8I       | BFE128    |         |
| 25    |         |          | F.32       | G.2       | LUBI    |           | BFNE128   |         |
| 26    |         |          | F.64       | G.4       |         |           | BFUE128   |         |
| 27    |         |          | F.128      | G.8       |         |           | BFNU6328  |         |
| 28    | ACOPYI  | ECOPYI   | GF.16      | G.16      | BGA729  |           | BFNUGE128 | BI      |
| 29    |         |          | GF.32      | G.32      |         |           | BFUGE128  | BLINKI  |
| 30    |         |          | GF.64      | G.64      |         |           | BFUL128   |         |
| 31    | A.MINOR | E.MINOR  |            | Y.16      | L.MINOR | S.MINOR   | BFNU128   | B.MINOR |

major operation code field values

For the major operation field values A.MINOR, L.MINOR, E.MINOR, F.16, F.32, F.64, F.128, GF.16, GF.32, GF.64, G.1, G.2, G.4, G.8, G.16, G.32, G.64, S.MINOR and B.MINOR, the lowest-order six bits in the instruction specify a minor operation code:

| 31..24 | 23..6 | 5..0  |
|--------|-------|-------|
| major  | other | minor |
| 8      | 18    | 6     |
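The two layouts above can be read off with shifts and masks. The following is a minimal sketch, assuming the bit positions given in the text (major code in the high-order 8 bits, minor code in the lowest-order 6 bits); the function name `decode` and the tuple it returns are hypothetical.

```python
def decode(word):
    """Split a 32-bit instruction word into (major, other, minor).

    The minor field is only meaningful when the major code selects a
    minor-coded group such as A.MINOR; otherwise bits 23..0 form a single
    24-bit "other" field.
    """
    if not 0 <= word < (1 << 32):
        raise ValueError("instruction words are 32 bits")
    major = (word >> 24) & 0xFF    # bits 31..24
    other = (word >> 6) & 0x3FFFF  # bits 23..6 (18 bits in minor formats)
    minor = word & 0x3F            # bits 5..0
    return major, other, minor


major, other, minor = decode(0xE4ABCDEF)
print(major, hex(other), minor)
```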

<sup>5</sup>Blank table entries cause the Reserved Instruction exception to occur.

MU 0023254

REDACTED

The minor field is filled with a value from one of the following tables:

| A.MINOR | 0    | 8    | 16    | 24 | 32 | 40 | 48 | 56     |
|---------|------|------|-------|----|----|----|----|--------|
| 0       |      |      | AAND  |    |    |    |    |        |
| 1       |      |      | AOR   |    |    |    |    |        |
| 2       |      |      | AXOR  |    |    |    |    |        |
| 3       |      |      | AANDN |    |    |    |    |        |
| 4       | AADD | ASUB | ANAND |    |    |    |    | ASHLI  |
| 5       |      |      | ANOR  |    |    |    |    |        |
| 6       |      |      | AXNOR |    |    |    |    | ASHRI  |
| 7       |      |      | AORN  |    |    |    |    | AUSHRI |

minor operation code field values for A.MINOR

| E.MINOR | 0       | 8       | 16    | 24                 | 32                   | 40                             | 48                  | 56        |
|---------|---------|---------|-------|--------------------|----------------------|--------------------------------|---------------------|-----------|
| 0       | ESETE   | ESUBE   | EAND  | E <del>SHL</del> 0 | EALMS                | E <del>W<del>SHR</del></del> 0 | E <del>SHR</del> 0  | ESILIG    |
| 1       | ESETNE  | ESUBNE  | EOR   | E <del>SHL</del> 0 | EASUM                | E <del>W<del>SHR</del></del> 0 | E <del>SHR</del> 0  | ECHLIVD   |
| 2       | ESETL   | ESUBL   | EXOR  | EEXPAND            |                      |                                |                     | EEXPANDI  |
| 3       | ESETGE  | ESUBGE  | EANDN | EUXPAND            |                      |                                |                     | EUEXPANDI |
| 4       | EADD    | ESUB    | ENAND | E <del>SHL</del> 0 |                      |                                |                     | ESHLI     |
| 5       | EADD\$0 | ESUB\$0 | ENOR  | E <del>SHL</del> 0 | E <del>SHR</del> 0   | E <del>SHR</del> 0             | E <del>SHR</del> 0  | ESHLI\$0  |
| 6       | ESETUL  | ESUBL   | EXNOR | E <del>SHR</del> 0 | E <del>GATHER</del>  | E <del>ROTATE</del>            | E <del>ROTATE</del> | ESHRI     |
| 7       | ESETGE  | ESUBGE  | EORN  | E <del>SHR</del> 0 | E <del>SCATTER</del> | E <del>ROTATE</del>            | E <del>ROTATE</del> | EUSHRI    |

minor operation code field values for E.MINOR

| F.size | 0         | 8         | 16        | 24        | 32      | 40        | 48       | 56       |
|--------|-----------|-----------|-----------|-----------|---------|-----------|----------|----------|
| 0      | FADD.N    | FADD      | FADD.F    | FADD.C    | FADD    | FADD.X    | FSETE    | FSETEX   |
| 1      | FSUB.N    | FSUB      | FSUB.F    | FSUB.C    | FSUB    | FSUB.X    | FSETNE   | FSETNEX  |
| 2      | FMUL.N    | FMUL      | FMUL.F    | FMUL.C    | FMUL    | FMUL.X    | FSETUE   | FSETUEX  |
| 3      | FDIV.N    | FDIV      | FDIV.F    | FDIV.C    | FDIV    | FDIV.X    | FSETNUE  | FSETNUEX |
| 4      | F.UNARY.N | F.UNARY.T | F.UNARY.F | F.UNARY.C | F.UNARY | F.UNARY.X | FSETNUGE | FSETNLX  |
| 5      |           |           |           |           |         |           | FSETUGE  | FSETNLX  |
| 6      |           |           |           |           |         |           | FSETUL   | FSETNGEX |
| 7      |           |           |           |           |         |           | FSETNUL  | FSETGEX  |

minor operation code field values for F.size

| GF.size | 0          | 8          | 16         | 24         | 32       | 40         | 48        | 56        |
|---------|------------|------------|------------|------------|----------|------------|-----------|-----------|
| 0       | GFADD.N    | GFADD      | GFADD.F    | GFADD.C    | GFADD    | GFADD.X    | GFSETE    | GFSETEX   |
| 1       | GFSUB.N    | GFSUB      | GFSUB.F    | GFSUB.C    | GFSUB    | GFSUB.X    | GFSETNE   | GFSETNEX  |
| 2       | GFMUL.N    | GFMUL      | GFMUL.F    | GFMUL.C    | GFMUL    | GFMUL.X    | GFSETUE   | GFSETUEX  |
| 3       | GFDIV.N    | GFDIV      | GFDIV.F    | GFDIV.C    | GFDIV    | GFDIV.X    | GFSETNUE  | GFSETNUEX |
| 4       | GF.UNARY.N | GF.UNARY.T | GF.UNARY.F | GF.UNARY.C | GF.UNARY | GF.UNARY.X | GFSETNUGE | GFSETNLX  |
| 5       |            |            |            |            |          |            | GFSETUGE  | GFSETNLX  |
| 6       |            |            |            |            |          |            | GFSETUL   | GFSETNGEX |
| 7       |            |            |            |            |          |            | GFSETNUL  | GFSETGEX  |

minor operation code field values for GF.size

| G.size | 0      | 8    | 16    | 24                  | 32                   | 40                  | 48               | 56               |
|--------|--------|------|-------|---------------------|----------------------|---------------------|------------------|------------------|
| 0      | GSETE  |      | GAND  | G <del>SHL</del>    | G <del>COPY</del>    | G <del>ROTATE</del> | GMUL             |                  |
| 1      | GSETNE |      | GOR   | G <del>SHL</del>    | G <del>SHR</del>     | G <del>ROTATE</del> | GUMUL            | GCOMPRESS        |
| 2      | GSETL  |      | GXOR  | G <del>EXPAND</del> | G <del>COPY</del>    |                     | GDIV             | GEXPANDI         |
| 3      | GSETGE |      | GANDN | G <del>EXPAND</del> | G <del>SHR</del>     |                     | GUDIV            | GUEXPANDI        |
| 4      | GADD   | GSUB | GNAND | G <del>SHL</del>    | G <del>SHL</del>     | G <del>ROTATE</del> | G <del>SHR</del> | G <del>SHR</del> |
| 5      |        |      | GNOR  | G <del>SHR</del>    | G <del>SHR</del>     | G <del>ROTATE</del> | G <del>SHR</del> | G <del>SHR</del> |
| 6      | GSETUL |      | GXNOR | G <del>SHR</del>    | G <del>GATHER</del>  | G <del>ROTATE</del> | G <del>SHR</del> | G <del>SHR</del> |
| 7      | GSETGE |      | GORN  | G <del>SHR</del>    | G <del>SCATTER</del> | G <del>ROTATE</del> | G <del>SHR</del> | G <del>SHR</del> |

minor operation code field values for G.size

MU 0023255

```

      FloatingPoint(minor.op, major.size, minor.round, ra, rb, rc)
    F.UNARY.N, F.UNARY.T, F.UNARY.F, F.UNARY.C,
    F.UNARY, F.UNARY.X:
      case unary of
        F.ABS, F.NEG, F.SQR,
        F.HALF, F.SINGLE, F.DOUBLE, F.QUAD,
        F.INT, F.FLOAT:
          FloatingPointUnary(unary.op, major.size, minor.round,
                             ra, rc)
        others:
          raise ReservedInstruction
      endcase
    others:
      raise ReservedInstruction
  endcase
GMULADD1, GMULADD2, GMULADD4,
GMULADD8, GMULADD16, GMULADD32,
GUMULADD2, GUMULADD4,
GUMULADD8, GUMULADD16, GUMULADD32,
GMUX, GMUXGATHER, GSCATTERMUX, G.EXTRACT.128:
  GroupTernary(major.size, ra, rb, rc, rd)
G.EXTRACT.I, G.EXTRACT.I.64:
  GroupExtractImmediate(major.ra, rb, rc, minor)
G.1, G.2, G.4, G.8, G.16, G.32:
  case minor of
    G.SHL, G.SHR, G.USHR, G.ADD, G.SUB, G.MUL, G.UMUL,
    G.AND, G.OR, G.XOR, G.ANDN, G.NAND, G.NOR, G.XNOR, G.ORN,
    G.SET.E, G.SET.NE, G.SET.L, G.SET.GE, G.SET.UL, G.SET.UGE,
    G.COPY, G.SWAP, G.DEAL, G.SHUFFLE, G.COMPRESS, G.EXPAND,
    G.GATHER, G.SCATTER:
      GroupMinorTernary(major.ra, rb, rc)
    G.COMPRESS.I, G.EXPAND.I, G.SHL.I, G.SHR.I, G.USHR.I:
      GroupShortImmediate(minor, major.ra, simm, rc)
    G.EXTRACT.I:
      GroupExtractImmediate(major.ra, rb, rc, minor)
    others:
      raise ReservedInstruction
  endcase
GFMULADD16, GFMULADD32, GFMULADD64,
GFMULSUB16, GFMULSUB32, GFMULSUB64:
  GroupFloatingPointTernary(major.ra, rb, rc, rd)
GF.16, GF.32, GF.64, GF.128:
  case minor of
    GF.ADD.N, GF.SUB.N, GF.MUL.N, GF.DIV.N,
    GF.ADD.T, GF.SUB.T, GF.MUL.T, GF.DIV.T,
    GF.ADD.F, GF.SUB.F, GF.MUL.F, GF.DIV.F,
    GF.ADD.C, GF.SUB.C, GF.MUL.C, GF.DIV.C,
    GF.ADD, GF.SUB, GF.MUL, GF.DIV,
    GF.ADD.X, GF.SUB.X, GF.MUL.X, GF.DIV.X,
    GF.SET.E, GF.SET.NE, GF.SET.UE, GF.SET.NUE,
    GF.SET.NUGE, GF.SET.UGE, GF.SET.UL, GF.SET.NUL,
    GF.SET.E.X, GF.SET.NE.X, GF.SET.UE.X, GF.SET.NUE.X,
    GF.SET.L.X, GF.SET.NL.X, GF.SET.NGE.X, GF.SET.GE.X:
      GroupFloatingPoint(minor.op, major.size, minor.round, ra, rb, rc)
    GF.UNARY.N, GF.UNARY.T, GF.UNARY.F, GF.UNARY.C,
    GF.UNARY, GF.UNARY.X:
      case unary of
        GF.ABS, GF.NEG, GF.SQR,

```

Group

These operations take two values from a pair of registers, perform operations on groups of bits in the operands, and place the concatenated results in a register.

Operation codes

|                      |                              |
|----------------------|------------------------------|
| G.ADD.2              | Group add pecks              |
| G.ADD.4              | Group add nibbles            |
| G.ADD.8              | Group add bytes              |
| G.ADD.16             | Group add doublets           |
| G.ADD.32             | Group add quadlets           |
| G.ADD.64             | Group add octlets            |
| G.AND <sup>10</sup>  | Group and                    |
| G.ANDN <sup>11</sup> | Group and not                |
| G.COMPRESS.1         | Group compress bits          |
| G.COMPRESS.2         | Group compress pecks         |
| G.COMPRESS.4         | Group compress nibbles       |
| G.COMPRESS.8         | Group compress bytes         |
| G.COMPRESS.16        | Group compress doublets      |
| G.COMPRESS.32        | Group compress quadlets      |
| G.COMPRESS.64        | Group compress octlets       |
| G.COPY.1             | Group copy bits              |
| G.COPY.2             | Group copy pecks             |
| G.COPY.4             | Group copy nibbles           |
| G.COPY.8             | Group copy bytes             |
| G.COPY.16            | Group copy doublets          |
| G.COPY.32            | Group copy quadlets          |
| G.COPY.64            | Group copy octlets           |
| G.DEAL.1             | Group deal bits              |
| G.DEAL.2             | Group deal pecks             |
| G.DEAL.4             | Group deal nibbles           |
| G.DEAL.8             | Group deal bytes             |
| G.DEAL.16            | Group deal doublets          |
| G.DEAL.32            | Group deal quadlets          |
| G.DIV.64             | Group signed divide octlets  |
| G.EXPAND.1           | Group signed expand bits     |
| G.EXPAND.2           | Group signed expand pecks    |
| G.EXPAND.4           | Group signed expand nibbles  |
| G.EXPAND.8           | Group signed expand bytes    |
| G.EXPAND.16          | Group signed expand doublets |
| G.EXPAND.32          | Group signed expand quadlets |
| G.EXPAND.64          | Group signed expand octlets  |

MU 0023308

<sup>10</sup>G.AND does not require a size specification, and is encoded as G.AND.1.

<sup>11</sup>G.ANDN does not require a size specification, and is encoded as G.ANDN.1. G.ANDN is used as the encoding for G.SET.L.1, and by reversing the operands, for G.SET.UL.1.

|                             |                                  |
|-----------------------------|----------------------------------|
| G.GATHER.2                  | Group gather pecks               |
| G.GATHER.4                  | Group gather nibbles             |
| G.GATHER.8                  | Group gather bytes               |
| G.GATHER.16                 | Group gather doublets            |
| G.GATHER.32                 | Group gather quadlets            |
| G.GATHER.64                 | Group gather octlets             |
| G.GATHER.128 <sup>12</sup>  | Group gather hexlets             |
| G.MUL.1 <sup>13</sup>       | Group signed multiply bits       |
| G.MUL.2                     | Group signed multiply pecks      |
| G.MUL.4                     | Group signed multiply nibbles    |
| G.MUL.8                     | Group signed multiply bytes      |
| G.MUL.16                    | Group signed multiply doublets   |
| G.MUL.32                    | Group signed multiply quadlets   |
| G.MUL.64                    | Group signed multiply octlets    |
| G.NAND <sup>14</sup>        | Group nand                       |
| G.NOR <sup>15</sup>         | Group nor                        |
| G.OR <sup>16</sup>          | Group or                         |
| G.ORN <sup>17</sup>         | Group or not                     |
| G.POLY.1                    | Group polynomial divide bits     |
| G.POLY.2                    | Group polynomial divide pecks    |
| G.POLY.4                    | Group polynomial divide nibbles  |
| G.POLY.8                    | Group polynomial divide bytes    |
| G.POLY.16                   | Group polynomial divide doublets |
| G.POLY.32                   | Group polynomial divide quadlets |
| G.POLY.64                   | Group polynomial divide octlets  |
| G.SCATTER.2                 | Group scatter pecks              |
| G.SCATTER.4                 | Group scatter nibbles            |
| G.SCATTER.8                 | Group scatter bytes              |
| G.SCATTER.16                | Group scatter doublets           |
| G.SCATTER.32                | Group scatter quadlets           |
| G.SCATTER.64                | Group scatter octlets            |
| G.SCATTER.128 <sup>18</sup> | Group scatter hexlets            |
| G.SHL.2                     | Group shift left pecks           |
| G.SHL.4                     | Group shift left nibbles         |
| G.SHL.8                     | Group shift left bytes           |
| G.SHL.16                    | Group shift left doublets        |
| G.SHL.32                    | Group shift left quadlets        |
| G.SHL.64                    | Group shift left octlets         |

MU 0023309

<sup>12</sup>G.GATHER.128 is encoded as G.GATHER.1.

<sup>13</sup>G.MUL.1 is used as the encoding for G.U.MUL.1.

<sup>14</sup>G.NAND does not require a size specification, and is encoded as G.NAND.1.

<sup>15</sup>G.NOR does not require a size specification, and is encoded as G.NOR.1.

<sup>16</sup>G.OR does not require a size specification, and is encoded as G.OR.1.

<sup>17</sup>G.ORN does not require a size specification, and is encoded as G.ORN.1. G.ORN is used as the encoding for G.SET.UGE.1, and by reversing the operands, for G.SET.GE.1.

<sup>18</sup>G.SCATTER.128 is encoded as G.SCATTER.1.

|                      |                                     |
|----------------------|-------------------------------------|
| G.SHR.2              | Group signed shift right pecks      |
| G.SHR.4              | Group signed shift right nibbles    |
| G.SHR.8              | Group signed shift right bytes      |
| G.SHR.16             | Group signed shift right doublets   |
| G.SHR.32             | Group signed shift right quadlets   |
| G.SHR.64             | Group signed shift right octlets    |
| G.SHUFFLE.1          | Group shuffle bits                  |
| G.SHUFFLE.2          | Group shuffle pecks                 |
| G.SHUFFLE.4          | Group shuffle nibbles               |
| G.SHUFFLE.8          | Group shuffle bytes                 |
| G.SHUFFLE.16         | Group shuffle doublets              |
| G.SHUFFLE.32         | Group shuffle quadlets              |
| G.SWAP.1             | Group swap bits                     |
| G.SWAP.2             | Group swap pecks                    |
| G.SWAP.4             | Group swap nibbles                  |
| G.SWAP.8             | Group swap bytes                    |
| G.SWAP.16            | Group swap doublets                 |
| G.SWAP.32            | Group swap quadlets                 |
| G.U.DIV.64           | Group unsigned divide octlets       |
| G.U.EXPAND.1         | Group unsigned expand bits          |
| G.U.EXPAND.2         | Group unsigned expand pecks         |
| G.U.EXPAND.4         | Group unsigned expand nibbles       |
| G.U.EXPAND.8         | Group unsigned expand bytes         |
| G.U.EXPAND.16        | Group unsigned expand doublets      |
| G.U.EXPAND.32        | Group unsigned expand quadlets      |
| G.U.EXPAND.64        | Group unsigned expand octlets       |
| G.U.MUL.2            | Group unsigned multiply pecks       |
| G.U.MUL.4            | Group unsigned multiply nibbles     |
| G.U.MUL.8            | Group unsigned multiply bytes       |
| G.U.MUL.16           | Group unsigned multiply doublets    |
| G.U.MUL.32           | Group unsigned multiply quadlets    |
| G.U.MUL.64           | Group unsigned multiply octlets     |
| G.U.SHR.2            | Group unsigned shift right pecks    |
| G.U.SHR.4            | Group unsigned shift right nibbles  |
| G.U.SHR.8            | Group unsigned shift right bytes    |
| G.U.SHR.16           | Group unsigned shift right doublets |
| G.U.SHR.32           | Group unsigned shift right quadlets |
| G.U.SHR.64           | Group unsigned shift right octlets  |
| G.XNOR <sup>19</sup> | Group exclusive-nor                 |
| G.XOR <sup>20</sup>  | Group exclusive-or                  |

MU 0023310

<sup>19</sup>G.XNOR does not require a size specification, and is encoded as G.XNOR.1. G.XNOR is used as the encoding for G.SET.E.1.

<sup>20</sup>G.XOR does not require a size specification, and is encoded as G.XOR.1. G.XOR is used as the encoding for G.ADD.1, G.SUB.1 and G.SET.NE.1.
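The one-bit encodings listed in footnotes 11, 13, 17, 19 and 20 can be checked exhaustively, since there are only four operand combinations. The sketch below assumes that a one-bit signed field holds 0 or -1 (two's complement) and that a SET operation yields 1 where the comparison holds; the function names are hypothetical.

```python
from itertools import product


def signed1(x):
    """Value of a 1-bit two's complement field: 0 -> 0, 1 -> -1."""
    return -x


def andn(x, y):
    """1-bit 'x and not y'."""
    return x & (y ^ 1)


def orn(x, y):
    """1-bit 'x or not y'."""
    return x | (y ^ 1)


def check_one_bit_encodings():
    """Return True if every 1-bit identity from the footnotes holds."""
    for a, b in product((0, 1), repeat=2):
        xor, xnor = a ^ b, (a ^ b) ^ 1
        checks = [
            xnor == int(a == b),                         # G.SET.E.1
            xor == int(a != b),                          # G.SET.NE.1
            xor == (a + b) & 1,                          # G.ADD.1
            xor == (a - b) & 1,                          # G.SUB.1
            andn(a, b) == int(signed1(a) < signed1(b)),  # G.SET.L.1
            andn(b, a) == int(a < b),                    # G.SET.UL.1, reversed
            orn(a, b) == int(a >= b),                    # G.SET.UGE.1
            orn(b, a) == int(signed1(a) >= signed1(b)),  # G.SET.GE.1, reversed
            (signed1(a) * signed1(b)) & 3 == a * b,      # G.MUL.1 vs G.U.MUL.1
        ]
        if not all(checks):
            return False
    return True


print(check_one_bit_encodings())
```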

| class             | op                                | size             |
|-------------------|-----------------------------------|------------------|
| linear            | ADD                               | 2 4 8 16 32 64   |
| bitwise           | AND ANDN NAND NOR OR ORN XNOR XOR |                  |
| signed multiply   | MUL                               | 1 2 4 8 16 32 64 |
| unsigned multiply | U.MUL                             | 2 4 8 16 32 64   |
| signed divide     | DIV                               | 64               |
| unsigned divide   | U.DIV                             | 64               |
| rearrange         | COPY SWAP DEAL SHUFFLE            | 1 2 4 8 16 32    |
|                   | GATHER SCATTER                    | 2 4 8 16 32 64   |
| galois field      | POLY                              | 2 4 8 16 32 64   |
| precision         | COMPRESS EXPAND U.EXPAND          | 1 2 4 8 16 32 64 |
| shift             | SHL SHR U.SHR                     | 2 4 8 16 32 64   |
Format

G.op.size

rc=ra,rb

Description

Two values are taken from the contents of registers ra and rb. The specified operation is performed, and the result is placed in register rc.
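To make the lane-wise behaviour concrete, the following sketch models a group add on Python integers, assuming 128-bit registers split into `size`-bit fields with carries discarded at each field boundary. The helpers `lanes`, `join` and `group_add` are hypothetical names, not part of the architecture.

```python
def lanes(value, size, width=128):
    """Split a width-bit integer into size-bit fields, lowest field first."""
    mask = (1 << size) - 1
    return [(value >> i) & mask for i in range(0, width, size)]


def join(fields, size):
    """Concatenate size-bit fields (lowest first) back into one integer."""
    mask = (1 << size) - 1
    out = 0
    for n, field in enumerate(fields):
        out |= (field & mask) << (n * size)
    return out


def group_add(a, b, size, width=128):
    """Add corresponding size-bit fields of a and b, wrapping within each."""
    pairs = zip(lanes(a, size, width), lanes(b, size, width))
    return join([x + y for x, y in pairs], size)


print(hex(group_add(0x00FF00FF, 0x00010001, 8)))
print(hex(group_add(0x00FF00FF, 0x00010001, 16)))
```

With byte-sized fields each 0xFF + 0x01 wraps to zero, whereas with 16-bit fields the carry stays inside the field, so the same operands produce different results at different sizes.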

Definition

def Group(op, size, ra, rb, rc)

case op of

G.MUL, G.U.MUL, G.DIV, G.U.DIV:

a ← REG[ra]

b ← REG[rb]

G.ADD, G.SUB, G.SET.L, G.SET.UL, G.SET.E, G.SET.NE, G.SET.GE, G.SET.UGE,  
G.AND, G.OR, G.XOR, G.ANDN, G.NAND, G.NOR, G.XNOR, G.ORN,  
G.GATHER, G.SCATTER:

a ← REG[ra]

b ← REG[rb]

G.COMPRESS, G.SHL, G.SHR, G.U.SHR, G.POLY:

a ← REG[ra]

b ← REG[rb]

G.EXPAND, G.U.EXPAND:

a ← REG[ra]

b ← REG[rb]

G.COPY, G.SWAP, G.DEAL, G.SHUFFLE:

a ← REG[ra] || REG[rb]

endcase

MU 0023311

```

case op of
  G.ADD:
    for i ← 0 to 128-size by size
      c_{i+size-1..i} ← a_{i+size-1..i} + b_{i+size-1..i}
    endfor
  G.MUL:
    for i ← 0 to 64-size by size
      c_{2*(i+size)-1..2*i} ← (a_{size-1+i}^{size} || a_{size-1+i..i}) * (b_{size-1+i}^{size} || b_{size-1+i..i})
    endfor
  G.U.MUL:
    for i ← 0 to 64-size by size
      c_{2*(i+size)-1..2*i} ← (0^{size} || a_{size-1+i..i}) * (0^{size} || b_{size-1+i..i})
    endfor
  G.DIV:
    if (b = 0) or ((a = (1 || 0^{63})) and (b = 1^{64})) then
      c ← undefined
    else
      q ← a / b
      r ← a - q * b
      c ← r_{63..0} || q_{63..0}
    endif
  G.U.DIV:
    if b = 0 then
      c ← undefined
    else
      q ← (0 || a) / (0 || b)
      r ← a - q * b
      c ← r_{63..0} || q_{63..0}
    endif
  G.AND:
    c ← a and b
  G.OR:
    c ← a or b
  G.XOR:
    c ← a xor b
  G.ANDN:
    c ← a and not b
  G.NAND:
    c ← not (a and b)
  G.NOR:
    c ← not (a or b)
  G.XNOR:
    c ← not (a xor b)
  G.ORN:
    c ← a or not b
  G.POLY:
    p[0] ← a
    for i ← 1 to size
      p[i] ← (p[i-1]_{0} ? (0^{64} || b) : 0^{128}) xor (p[i-1]_{0} || p[i-1]_{127..1})
    endfor
    c ← p[size]
  G.GATHER:
    for k ← 0 to 128-size by size
      j ← k
      for i ← k to k+size-1 by 1
        if a_{i} then
          c_{j} ← b_{i}

```

MU 0023312

```

          j ← j + 1
        endif
      endfor
      j ← k+size-1
      for i ← k+size-1 to k by -1
        if ~a_{i} then
          c_{j} ← b_{i}
          j ← j - 1
        endif
      endfor
    endfor
  G.SCATTER:
    for k ← 0 to 128-size by size
      j ← k
      for i ← k to k+size-1 by 1
        if a_{i} then
          c_{i} ← b_{j}
          j ← j + 1
        endif
      endfor
      j ← k+size-1
      for i ← k+size-1 to k by -1
        if ~a_{i} then
          c_{i} ← b_{j}
          j ← j - 1
        endif
      endfor
    endfor
  G.COMPRESS:
    for i ← 0 to 64-size by size
      c_{i+size-1..i} ← a_{i+i+size-1-(b&(size-1))..i+(b&(size-1))}
    endfor
  G.EXPAND:
    for i ← 0 to 64-size by size
      c_{i+size+size-1..i+i} ← a_{i+size-1..i+(b&(size-1))} || 0^{b&(size-1)}
    endfor
  G.U.EXPAND:
    for i ← 0 to 64-size by size
      c_{i+size+size-1..i+i} ← 0^{size-(b&(size-1))} || a_{i+size-1..i+(b&(size-1))} || 0^{b&(size-1)}
    endfor
  G.SHL:
    for i ← 0 to 128-size by size
      c_{i+size-1..i} ← a_{i+size-1-(b&(size-1))..i} || 0^{b&(size-1)}
    endfor
  G.SHR:
    for i ← 0 to 128-size by size
      c_{i+size-1..i} ← a_{i+size-1}^{b&(size-1)} || a_{i+size-1..i+(b&(size-1))}
    endfor
  G.U.SHR:
    for i ← 0 to 128-size by size
      c_{i+size-1..i} ← 0^{b&(size-1)} || a_{i+size-1..i+(b&(size-1))}
    endfor
  G.COPY:
    for i ← 0 to 128-size by size
      c_{i+size-1..i} ← a_{i+size-1..0}

```

MU 0023313

```
    endfor
  G.SWAP:
    for i ← 0 to 128-size by size
      c_{i+size-1..i} ← a_{127-i..128-size-i}
    endfor
  G.DEAL:
    for i ← 0 to 128-size by size
      j ← (i_{5..0} || 0^{1}) + (i_{6} ? size : 0)
      c_{i+size-1..i} ← a_{j+size-1..j}
    endfor
  G.SHUFFLE:
    for i ← 0 to 128-size by size
      j ← (0^{1} || i_{6..1}) + ((i&size) ? (64-(0^{1} || size_{6..1})) : 0)
      c_{i+size-1..i} ← a_{j+size-1..j}
    endfor
endcase
REG[rc] ← c
enddef
```
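The G.GATHER loop in the definition can be transcribed almost directly. The sketch below follows that loop: within each `size`-bit group, bits of `b` whose mask bit in `a` is set are packed upward from the low end, and the remaining bits are packed downward from the high end. The `scatter` function is written as the inverse of `gather`; that reading of G.SCATTER is an assumption, since the subscripts in the scanned listing are hard to make out. The `width` parameter is also an addition, used only to keep the example small.

```python
def gather(a, b, size, width=128):
    """Transcription of the G.GATHER loop; a is the mask, b the data."""
    c = 0
    for k in range(0, width, size):
        j = k
        for i in range(k, k + size):              # selected bits, low to high
            if (a >> i) & 1:
                c |= ((b >> i) & 1) << j
                j += 1
        j = k + size - 1
        for i in range(k + size - 1, k - 1, -1):  # other bits, high to low
            if not (a >> i) & 1:
                c |= ((b >> i) & 1) << j
                j -= 1
    return c


def scatter(a, b, size, width=128):
    """Assumed inverse of gather: moves packed bits back to mask positions."""
    c = 0
    for k in range(0, width, size):
        j = k
        for i in range(k, k + size):
            if (a >> i) & 1:
                c |= ((b >> j) & 1) << i
                j += 1
        j = k + size - 1
        for i in range(k + size - 1, k - 1, -1):
            if not (a >> i) & 1:
                c |= ((b >> j) & 1) << i
                j -= 1
    return c


mask, data = 0b10101010, 0b11011000
packed = gather(mask, data, 8, width=8)
print(bin(packed), bin(scatter(mask, packed, 8, width=8)))
```

In this example the bits of `data` at the odd positions land in the low nibble of `packed`, and applying `scatter` with the same mask restores `data`.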

Exceptions

Reserved Instruction

REGISTERED  
MICROUNITY  
DO NOT REPRODUCE  
COPY #247  
Final Test

MU 0023314
